Abstract. Let $n$ be a positive integer and $X = [x_{ij}]_{1 \le i, j \le n}$ be an $n \times n$
sized matrix of independent random variables having joint uniform distribution

\[ \Pr\{x_{ij} = k\} = \frac{1}{n} \quad (1 \le k \le n;\ 1 \le i, j \le n). \]

A realization $M = [m_{ij}]$ of $X$ is called good if each of its rows and each of its
columns contains a permutation of the numbers $1, 2, \ldots, n$. We present and
analyse four typical algorithms which decide whether a given realization
is good.



1 Introduction 

Some subsets of the elements of Latin squares [1, 13, 23, 29, 32, 53, 54, 59, 60], 
of Sudoku squares [6, 7, 15, 16, 20, 21, 22, 28, 31, 45, 50, 55, 57, 60, 62, 
65, 66, 69, 71], of de Bruijn arrays [2, 3, 4, 5, 10, 11, 18, 26, 27, 35, 38, 39, 
42, 44, 48, 52, 56, 61, 64, 68, 70, 72] and gerechte designs, connected with 
agricultural and industrial experiments [7, 8, 34] have to contain different 
elements. The one dimensional special case is also studied in several papers
[30, 33, 36, 37, 38, 40, 41, 46, 47, 49].

The testing of these matrices raises the following problem.
Let $m \ge 1$ and $n \ge 1$ be integers and $X = [x_{ij}]_{1 \le i \le m,\ 1 \le j \le n}$ be an $m \times n$ sized
matrix of independent random variables having joint uniform distribution

\[ \Pr\{x_{ij} = k\} = \frac{1}{n} \quad (1 \le k \le n;\ 1 \le i \le m,\ 1 \le j \le n). \]

A realization $M = [m_{ij}]$ of $X$ is called good if each of its rows and each of its columns
contains different elements (in the case $m = n$, a permutation of the numbers
$1, 2, \ldots, n$). We present and analyse algorithms which decide whether a given
realization is good. If the realization is good then the output of the algorithms
is True, otherwise it is False.

The structure of the paper is as follows. Section 1 contains the introduction.
In Section 2 the mathematical background of the main results is prepared.
Section 3 contains the running times of the testing algorithms Linear, Backward,
Bucket and Matrix in worst, best and expected cases. In Section 4
the results are summarised.

2 Mathematical background 

We start with the first step of the testing of $M$: we describe and analyse several
algorithms testing the first row of $M$. The inputs of these algorithms are
$n$ (the length of the first row of $M$) and the elements of the first row
$m = (m_{11}, m_{12}, \ldots, m_{1n})$. For simplicity we use the notation $s = (s_1, s_2, \ldots, s_n)$.
The output is always a logical variable $g$ (its value is True if the input
sequence is good, and False otherwise).

We will denote the binomial coefficient $\binom{n}{k}$ by $B(n, k)$ and the function $\log_2 n$
by $\lg n$ [19], and usually omit the argument $n$ from the functions $\tau(n)$, $\sigma(n)$,
$\kappa(n)$, $\kappa_1(n)$, $\kappa_2(n)$, $\gamma(n)$, $\lambda(n)$, $\delta(n)$, $\alpha(n)$, $\mu(n)$, $\eta(n)$, $\varphi(n)$, $\rho(n)$, $\beta(n)$,
$S_i(n)$, $R_i(n)$, $Q(n)$, $p_k(n)$, $y(n)$, $q_i(k, n)$, $A_i(n)$, $b_j(n)$, $f(n)$, $p(i, j, k, n)$,
$c_j(n)$, $c(n)$, and $A(i_1, i_2, k, n)$.

We characterise the running time of the algorithms by the number of necessary
assignments and comparisons and denote the running time of algorithm
Alg by $T_{\text{worst}}(n, \text{Alg})$, $T_{\text{best}}(n, \text{Alg})$ and $T_{\text{exp}}(n, \text{Alg})$ in the worst, best,
resp. expected case. The numbers of the corresponding assignments and comparisons
are denoted by $A$, resp. $C$. The notations $O$, $\Omega$, $\Theta$, $o$ and $\omega$ are used
according to [19, pages 43-52] and [51, pages 107-110].

Before the investigation of the concrete algorithms we formulate several
lemmas. The first lemma is the following version of the well-known Stirling's
formula.

Lemma 1 ([19]) If $n \ge 1$ then

\[ n! = \left(\frac{n}{e}\right)^{n} \sqrt{2\pi n}\, e^{\tau_n}, \tag{1} \]

where

\[ \frac{1}{12n+1} < \tau_n < \frac{1}{12n}, \]

and $\tau_n = \tau(n) = \tau$ tends monotonically decreasing to zero when $n$ tends to infinity.
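The bounds of Lemma 1 can be checked numerically. The following is a minimal sketch, assuming double-precision arithmetic is accurate enough for small $n$; the helper names `stirling_tau` and `tau_within_bounds` are illustrative and do not come from the paper.

```python
from math import lgamma, log, pi

def stirling_tau(n):
    """Return tau_n defined by n! = (n/e)^n * sqrt(2*pi*n) * e^{tau_n}."""
    # lgamma(n + 1) is ln(n!); subtract the logarithm of the Stirling main term.
    return lgamma(n + 1) - (n * log(n) - n + 0.5 * log(2 * pi * n))

def tau_within_bounds(n):
    """Check 1/(12n+1) < tau_n < 1/(12n) from Lemma 1."""
    return 1 / (12 * n + 1) < stirling_tau(n) < 1 / (12 * n)
```

For moderate $n$ the gap between the two bounds is far larger than the rounding error, so the check is meaningful; for very large $n$ exact arithmetic would be needed.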

Let $a_k(n) = a_k$ and $S_i(n) = S_i$ be defined for any positive integer $n$ as follows:

\[ a_k = \frac{n^k}{k!} \quad (k = 0, 1, 2, \ldots), \]

\[ S_i = \sum_{k=0}^{n-1} a_k k^i \quad (i = 0, 1, 2, \ldots). \tag{2} \]

If in (2) $k = i = 0$, then $k^i = 1$.

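Solving a problem posed by S. Ramanujan [63], Gábor Szegő [67] proved
the following connection between $e^n$ and $S_0$.

Lemma 2 ([67]) The function $\sigma(n) = \sigma$, defined by

\[ \frac{e^n}{2} = S_0 + \left(\frac{1}{3} + \sigma\right) a_n = \sum_{k=0}^{n-1} \frac{n^k}{k!} + \left(\frac{1}{3} + \sigma\right) a_n \quad (n = 1, 2, \ldots) \tag{3} \]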

and 

cr(0) = l, 

tends monotonically decreasing to zero when $n$ tends to $\infty$.

The following lemma shows the connection among $S_i$ and $S_0, S_1, \ldots, S_{i-1}$.

Lemma 3 If $i$ and $n$ are positive integers, then

\[ S_i = n \sum_{k=0}^{i-1} B(i-1, k)\, S_k - n^{i} a_{n-1} \tag{4} \]

and

\[ S_i = \Theta(e^n n^i). \tag{5} \]

Proof. Omitting the member belonging to the index $k = 0$ in $S_i$, then simplifying
by $k$ and using the substitution $k - 1 = j$ we get

\[ S_i = \sum_{k=0}^{n-1} \frac{n^k k^i}{k!} = \sum_{k=1}^{n-1} \frac{n^k k^{i-1}}{(k-1)!} = n \sum_{j=0}^{n-2} \frac{n^j (j+1)^{i-1}}{j!}. \]

Completing the sum with the member belonging to index $j = n - 1$ results

\[ S_i = n \sum_{j=0}^{n-1} \frac{n^j (j+1)^{i-1}}{j!} - n^{i} a_{n-1}. \tag{6} \]

Now the application of the binomial theorem results (4).

According to (3) $S_0 = \Theta(e^n)$, so using induction and (6) we get (5). □

In this paper we need only the simple form of $S_0$, $S_1$, $S_2$ and $S_3$, which is
presented in the next lemma.

Lemma 4 If $n$ is a positive integer then

\[ S_0 = \frac{e^n}{2} - \left(\frac{1}{3} + \sigma\right) a_n, \tag{7} \]

\[ S_1 = n S_0 - n a_{n-1}, \qquad S_2 = S_0 (n^2 + n) - 2 n^2 a_n, \tag{8} \]

and

\[ S_3 = S_0 (n^3 + 3n^2 + n) - (3n^3 + 2n^2)\, a_n. \tag{9} \]

Proof. Expressing $S_0$ from (3), and using recursively Lemma 3 for $i = 1$, 2
and 3 we get the required formula for $S_0$, $S_1$, $S_2$, and $S_3$. □
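The identities (8) and (9) can be verified with exact rational arithmetic. The sketch below is only an illustration; the function names `a`, `S` and `lemma4_holds` are assumptions of this example. It uses $a_{n-1} = a_n$, which follows directly from the definition of $a_k$.

```python
from fractions import Fraction
from math import factorial

def a(n, k):
    """a_k(n) = n^k / k! as an exact fraction."""
    return Fraction(n**k, factorial(k))

def S(n, i):
    """S_i(n) from (2); Python evaluates 0**0 as 1, so S(n, 0) includes a_0."""
    return sum(a(n, k) * k**i for k in range(n))

def lemma4_holds(n):
    """Check the formulas for S_1, S_2 and S_3 of Lemma 4 exactly."""
    s0, an = S(n, 0), a(n, n)          # a(n, n) equals a(n, n - 1)
    return (S(n, 1) == n * s0 - n * an
            and S(n, 2) == s0 * (n**2 + n) - 2 * n**2 * an
            and S(n, 3) == s0 * (n**3 + 3 * n**2 + n) - (3 * n**3 + 2 * n**2) * an)
```

Formula (7) involves $e^n$ and $\sigma$ and therefore cannot be tested with fractions alone; it serves as the definition of $\sigma$.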

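We introduce also another useful function $R_i(n) = R_i$:

\[ R_i = \sum_{k=1}^{n} p_k(n)\, k^i \quad (i = 0, 1, 2, \ldots), \tag{10} \]

where $p_k(n) = p_k$ is the key probability of this paper, defined in [33] as

\[ p_k = \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{n-k+1}{n} \cdot \frac{k}{n} = \frac{n!\, k}{(n-k)!\, n^{k+1}} \quad (k = 1, 2, \ldots, n). \tag{11} \]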
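As an illustration of (10) and (11), the probabilities $p_k$ and the moments $R_i$ can be computed exactly for small $n$. This is a sketch only; the function names `p` and `R` are chosen for this example.

```python
from fractions import Fraction
from math import factorial

def p(n, k):
    """p_k(n) from (11): the first k elements differ and element k+1 repeats one of them."""
    return Fraction(factorial(n) * k, factorial(n - k) * n**(k + 1))

def R(n, i):
    """R_i(n) from (10): the i-th moment of the distribution p_1, ..., p_n."""
    return sum(p(n, k) * k**i for k in range(1, n + 1))
```

For $n = 5$ the probabilities are 0.2, 0.32, 0.288, 0.1536 and 0.0384, they sum to 1, and $R_1 = 2.5104$.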



The following lemma mirrors the connection between the function $R_i$ and
the functions $S_0, S_1, \ldots, S_{i+1}$.

Lemma 5 If $i$ and $n$ are positive integers, then

\[ R_i = \frac{n!}{n^{n+1}} \sum_{l=0}^{i+1} (-1)^l B(i+1, l)\, n^{i+1-l} S_l. \tag{12} \]

Proof. Using (10) and (11) the substitution $n - k = j$ results

\[ R_i = \sum_{k=1}^{n} \frac{n!\, k^{i+1}}{(n-k)!\, n^{k+1}} = \frac{n!}{n^{n+1}} \sum_{j=0}^{n-1} (n-j)^{i+1} \frac{n^j}{j!}. \]

From here, using the binomial theorem we get (12). □
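In this paper we need only the following consequence of Lemma 5.

Lemma 6 If $n$ is a positive integer, then

\[ R_0 = 1, \qquad R_1 = \frac{n!}{n^n} S_0, \]

and

\[ R_2 = 2n - \frac{n!}{n^n} S_0. \tag{13} \]

Proof. $R_0 = 1$ follows from the definition of the probabilities $p_k$. Substituting
$i = 1$ into (12) we get

\[ R_1 = \frac{n!}{n^{n+1}} \left( n^2 \sum_{j=0}^{n-1} \frac{n^j}{j!} - 2n \sum_{j=0}^{n-1} j\, \frac{n^j}{j!} + \sum_{j=0}^{n-1} j^2\, \frac{n^j}{j!} \right). \]

From here, using (2) we get

\[ R_1 = \frac{n!}{n^{n+1}} \left( n^2 S_0 - 2n S_1 + S_2 \right), \]

and using (8) the required formula for $R_1$.

Substituting $i = 2$ into (12) we get

\[ R_2 = \frac{n!}{n^{n+1}} \left( n^3 \sum_{j=0}^{n-1} \frac{n^j}{j!} - 3n^2 \sum_{j=0}^{n-1} j\, \frac{n^j}{j!} + 3n \sum_{j=0}^{n-1} j^2\, \frac{n^j}{j!} - \sum_{j=0}^{n-1} j^3\, \frac{n^j}{j!} \right). \]

From here, using (2) we have

\[ R_2 = \frac{n!}{n^{n+1}} \left( n^3 S_0 - 3n^2 S_1 + 3n S_2 - S_3 \right), \tag{14} \]

and using (8) and (9) the required formula for $R_2$. □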
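Lemma 6 can be confirmed for small $n$ with exact fractions. The sketch below is illustrative only; `S0`, `R` and `lemma6_holds` are names assumed for this example.

```python
from fractions import Fraction
from math import factorial

def S0(n):
    """S_0(n) = sum_{k=0}^{n-1} n^k / k!."""
    return sum(Fraction(n**k, factorial(k)) for k in range(n))

def R(n, i):
    """R_i(n) = sum_k p_k k^i with p_k = n! k / ((n-k)! n^(k+1))."""
    return sum(Fraction(factorial(n) * k**(i + 1), factorial(n - k) * n**(k + 1))
               for k in range(1, n + 1))

def lemma6_holds(n):
    """Check R_0 = 1, R_1 = (n!/n^n) S_0 and R_2 = 2n - (n!/n^n) S_0."""
    r1 = Fraction(factorial(n), n**n) * S0(n)
    return R(n, 0) == 1 and R(n, 1) == r1 and R(n, 2) == 2 * n - r1
```

In particular $R_1 + R_2 = 2n$ holds exactly for every $n$ that was tried.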
The following lemmas give some further properties of $R_1$ and $R_2$.

Lemma 7 If $n$ is a positive integer, then

Inn 1 
3 



Ri = ^So 




where 



K n = K 




+ K, 



2cTe^ 



(15) 



(16) 



and K tends monotonically decreasing to zero when n tends to infinity. 
Proof. Substituting So according to (7) in the formula (13) for Ri we get 
n! Te'^ TL^ /I M 1 n! /e" n'^ 



Ri = — 















T 


~ ttJ 





3 V 2 n! 



(17) 



Substitution of n! according to (1) (Stirling's formula) and writing 1 + (6"^— 1 ' 
instead of results 



Ri 



3 V e / 



The product P of the expressions in the square brackets is 



P = y + y (e^-l)-cre% 



therefore 
implying 
Let 

Ki(n) = Ki 
and 




7tn 1 V2.nn 
- T + — — 



nn 1 Inn , _ 





nn 2<3e^ 




2 



(18) 



(19) 



(20) 



(21) 



TTTL 




e"^ - 1), K2(rL) = K2 




2 



, K = Kl + K2 , (22) 



K(n+1) Ki(n+1)-K2(n+1) 
y [nj = y = ;-— ; — = for n = 1 , 2, . . . . (23) 



K n 



Kilnj - K2l,nj 
Since all $\kappa$ functions are positive for all positive integers $n$, $\gamma < 1$
for $n \ge 1$ implies the monotonicity of $\kappa$. Numerical results in Table 1 show that
$\gamma < 1$ for $n = 1, 2, \ldots, 9$, therefore it remains to show $\gamma < 1$ for $n \ge 10$.

Klin + 1) can be omitted from the numerator of (22). Since ct and t are 
monotone decreasing functions, and < o[5) < 0.0058, and < e"^'^' < 1 .02, 
and TL^ < for n > 10, therefore 

2o-e^ 2 • 0.0058 • 1 .02 0.012 , 

< < — ^ for n > 10 . (24) 

Using (23), (24) and the Lagrange remainder of the Taylor series of the 
function e" we have 

/fT + T T(n+l)+T^£,n+l/2 

T(n) + T2£,^/2-0.012/n2 ' 
where < f,n+i < n. + 1 and < £,n < n., therefore using Lemma 1 we get 

y<^^ . (25) 

\/rL 1 I W M 

12n 2 U2n/ 

Now multiplication of the numerator and denominator of the right side of
(25) by $(12n)^2$ results

Vn + 1 ii^+^3 ^ Vn + 1 12n 

12n + 0.5 - 1.584 (i2n - 1 .084) (l + ' ^ ^ 

Since 

f17n - 1 0841 I 1 + 

12n 

(26) and (27) imply 

Vl44n3 + 144n2 ^ 

y < , < 1 , 

Vl44n3 +240n2 

finishing the proof of the monotonicity of $\kappa$. □

We remark that the monotonicity of $\kappa$ was published in [40] without proof, and
was proved by E. Bokova and G. Tzaturjan in 1985 [9], and in 1988 by
T. T. Cirulis and A. Iványi [17], using a formula due to E. Egorychev et al. [25]
derived by the method of integral representation of combinatorial sums
elaborated by E. P. Egorychev [24]. Our proof is much simpler than the earlier
ones.



12n- 1.084) ( 1 +-1^ ) > 12n + 10, (27) 
Lemma 8 If $n$ is a positive integer, then




[mi ^ 



where 



(28) 



and $\lambda$ tends monotonically decreasing to zero when $n$ tends to infinity.
Proof. The proof is omitted since it is similar to the proof of Lemma 7. □

3 Running times of the algorithms 

In the following analysis let $n \ge 1$ and let $x = (x_1, x_2, \ldots, x_n)$ be independent
random variables having uniform distribution on the set $\{1, 2, \ldots, n\}$. The input
sequence of the algorithms is $s = (s_1, s_2, \ldots, s_n)$ (a realization of $x$).

We derive exact formulas for the expected numbers of comparisons
$C_{\text{exp}}(n, \text{Linear}) = C_L$, $C_{\text{exp}}(n, \text{Backward}) = C_W$, and $C_{\text{exp}}(n, \text{Bucket}) = C_B$,
further for the expected running times $T_{\text{exp}}(n, \text{Linear}) = T_L$,
$T_{\text{exp}}(n, \text{Backward}) = T_W$, and $T_{\text{exp}}(n, \text{Bucket}) = T_B$.

The inputs of the following algorithms are $n$ (the length of the sequence $s$)
and $s = (s_1, s_2, \ldots, s_n)$, a sequence of integers with $1 \le s_i \le n$
for $1 \le i \le n$, in all cases. The output is always a logical variable $g$ (its value
is True if the input sequence is good, and False otherwise). The working
variables are usually the cycle variables $i$ and $j$.

We use the pseudocode defined in [19].

3.1 Definition and running time of algorithm LINEAR 

Linear writes zero into the elements of an $n$ length vector $v = (v_1, v_2, \ldots, v_n)$,
then investigates the elements of the realization $s$ and if $v_{s_i} > 0$
(signalising a repetition), then returns False, otherwise adds 1 to $v_{s_i}$. If Linear
does not find a repetition among the elements of $s$ then it returns finally
True.

LINEAR(n, s)

1 g ← True
2 for i ← 1 to n
3     v_i ← 0
4 for i ← 1 to n
5     if v_{s_i} > 0
6         g ← False
7         return g
8     else v_{s_i} ← v_{s_i} + 1
9 return g
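The pseudocode of Linear translates almost line by line into Python. The following is a minimal sketch; the function name `linear` and the returned comparison counter are additions made for this illustration and are not part of the original algorithm.

```python
def linear(n, s):
    """Return (g, comparisons) for a sequence s with 1 <= s[i] <= n."""
    v = [0] * (n + 1)            # v[1..n] as in lines 2-3; index 0 is unused
    comparisons = 0
    for x in s:                  # line 4
        comparisons += 1         # the test in line 5
        if v[x] > 0:             # x was seen earlier: repetition
            return False, comparisons
        v[x] += 1                # line 8
    return True, comparisons
```

For a good input the counter equals $n$; for a bad input it equals the position of the first repeated element.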

Linear needs assignments in lines 1, 3, and 8, and it needs comparisons in
line 5. The number of assignments in lines 1 and 3 equals $n + 1$ for arbitrary
input and varies between 1 and $n$ in line 8. The number of comparisons in line
5 also varies between 1 and $n$. Therefore the running time of Linear is $\Theta(n)$
in the best, worst and expected case too.

The following theorem gives the expected number of the comparisons of 
Linear. 



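Theorem 9 The expected number of comparisons $C_{\text{exp}}(n, \text{Linear}) = C_L$ of
Linear is

\[ C_L = 1 - \frac{n!}{n^n} + R_1 = \sqrt{\frac{\pi n}{2}} + \frac{2}{3} + \kappa - \frac{n!}{n^n}, \tag{29} \]

where

\[ \kappa(n) = \kappa = \frac{1}{3} - \sqrt{\frac{\pi n}{2}} + \sum_{k=1}^{n} \frac{n!\, k^2}{(n-k)!\, n^{k+1}} \]

tends monotonically decreasing to zero when $n$ tends to infinity.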



Proof. Let

\[ y(n) = y = \max\{k : 1 \le k \le n \text{ and } s_1, s_2, \ldots, s_k \text{ are different}\} \tag{30} \]

be a random variable characterising the maximal length of the prefix of $s$
containing different elements. Then

\[ \Pr\{y = k\} = p_k \quad (k = 1, 2, \ldots, n), \]

where $p_k$ is the probability introduced in (11).

If $y = k$ and $1 \le k \le n - 1$, then Linear executes $k + 1$ comparisons, and
only $n$ comparisons if $y = n$, therefore

\[ C_L = \sum_{k=1}^{n-1} p_k (k+1) + p_n n = \sum_{k=1}^{n} p_k (k+1) - p_n = 1 - \frac{n!}{n^n} + \sum_{k=1}^{n} p_k k, \tag{31} \]

from where using Lemma 7 we receive

\[ C_L = 1 - \frac{n!}{n^n} + R_1 = \sqrt{\frac{\pi n}{2}} + \frac{2}{3} + \kappa - \frac{n!}{n^n}. \tag{32} \]

The monotonicity of $\kappa(n)$ was proved in the proof of Lemma 7. □
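Formula (31) can be cross-checked by exhaustive enumeration of all $n^n$ input sequences for small $n$. The sketch below assumes that the number of comparisons is counted as in the proof above; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def comparisons_linear(s):
    """Number of executions of the test in line 5 of Linear on input s."""
    seen, count = set(), 0
    for x in s:
        count += 1
        if x in seen:            # first repetition found, Linear stops here
            break
        seen.add(x)
    return count

def expected_by_enumeration(n):
    """Average the comparison count over all n^n equally likely sequences."""
    total = sum(comparisons_linear(s) for s in product(range(1, n + 1), repeat=n))
    return Fraction(total, n**n)

def expected_by_formula(n):
    """C_L = 1 - n!/n^n + sum_k p_k k, the last form in (31)."""
    r1 = sum(Fraction(factorial(n) * k * k, factorial(n - k) * n**(k + 1))
             for k in range(1, n + 1))
    return 1 - Fraction(factorial(n), n**n) + r1
```

For $n = 3$ both functions give $8/3$, and for $n = 5$ they give $3.472$, in agreement with the values of $C_L$ in Table 1.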
The next assertion gives the running time of Linear. 

Theorem 10 The expected running time $T_{\text{exp}}(n, \text{Linear}) = T_L$ of Linear is

\[ T_L = n + \sqrt{2\pi n} + \frac{7}{3} + 2\kappa - 2\,\frac{n!}{n^n}, \]

where $\kappa$ tends monotonically decreasing to zero when $n$ tends to infinity.

Proof. Linear requires $n + 1$ assignments in lines 1 and 3, plus assignments
in line 8. The expected number of assignments in line 8 is the same as $C_L$.
Therefore

\[ T_L = n + 1 + 2 C_L. \tag{33} \]

Substitution of (32) into (33) results the required formula. □
We remark that (32) is equivalent with

\[ C_L = 1 - \frac{n!}{n^n} + 1 + \frac{n-1}{n} + \frac{n-1}{n}\,\frac{n-2}{n} + \cdots + \frac{n-1}{n}\,\frac{n-2}{n} \cdots \frac{1}{n}, \]

demonstrating the close connection with the function

\[ Q(n) = Q = C_L - 1 + \frac{n!}{n^n}, \tag{34} \]

studied by several authors, e.g. in [12, 40, 51].

Table 1 shows the concrete values of the functions appearing in the analysis
of $C_L$ and $T_L$ for $1 \le n \le 10$, where $C_L$ was calculated using (32), $\kappa$ using (11),
and $\sigma$ using (3) (data in this and further tables are taken from [43]). We can
observe in Table 1 that $\delta(n) = \delta = \kappa - n!/n^n$ is increasing from $n = 1$ to $n = 8$,
but for larger $n$ is decreasing. Taking into account that for $n > 8$



\e/ n^ n^ 

holds, we can prove (using the same arguments as in the proof of Lemma 7)
the following assertion.

  n   C_L        u          n!/n^n     κ          δ           σ
  1   1.000000   1.919981   1.000000   0.080019   -0.919981   0.025808
  2   2.000000   2.439121   0.500000   0.060879   -0.439121   0.013931
  3   2.666667   2.837470   0.222222   0.051418   -0.170804   0.009504
  4   3.125000   3.173295   0.093750   0.045455   -0.048295   0.007205
  5   3.472000   3.469162   0.038400   0.041238   +0.002838   0.005799
  6   3.759259   3.736647   0.015432   0.038045   +0.022612   0.004852
  7   4.012019   3.982624   0.006120   0.035515   +0.029395   0.004170
  8   4.242615   4.211574   0.002403   0.033444   +0.031040   0.003656
  9   4.457379   4.426609   0.000937   0.031707   +0.030770   0.003255
 10   4.659853   4.629994   0.000363   0.030222   +0.029859   0.002933

Table 1: Values of $C_L$, $u = \sqrt{\pi n/2} + 2/3$, $n!/n^n$, $\kappa$, $\delta = \kappa - n!/n^n$, and $\sigma$ for
$n = 1, 2, \ldots, 10$
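The first columns of Table 1 can be recomputed directly from the definitions. The following sketch assumes $\kappa = R_1 - \sqrt{\pi n/2} + 1/3$, which is the form given in Theorem 9; the function name `table1_row` is an assumption of this example, and the column $\sigma$ is not recomputed here.

```python
from fractions import Fraction
from math import factorial, pi, sqrt

def table1_row(n):
    """Return (C_L, u, n!/n^n, kappa, delta) for one row of Table 1."""
    ratio = Fraction(factorial(n), n**n)
    # R_1 = sum_k p_k * k with p_k = n! k / ((n-k)! n^(k+1)), computed exactly
    r1 = sum(Fraction(factorial(n) * k * k, factorial(n - k) * n**(k + 1))
             for k in range(1, n + 1))
    c_l = float(1 - ratio + r1)
    u = sqrt(pi * n / 2) + 2 / 3
    kappa = float(r1) - sqrt(pi * n / 2) + 1 / 3
    delta = kappa - float(ratio)
    return c_l, u, float(ratio), kappa, delta
```

Calling `table1_row(5)` reproduces the row for $n = 5$ up to the six printed decimals, including the first positive value of $\delta$.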



Theorem 11 The expected running time $T_{\text{exp}}(n, \text{Linear}) = T_L$ of Linear is

\[ T_L = n + \sqrt{2\pi n} + \frac{7}{3} + 2\delta, \]

where $\delta(n) = \delta$ tends to zero when $n$ tends to infinity, further

\[ \delta(n+1) > \delta(n) \ \text{ for } 1 \le n \le 7 \quad \text{and} \quad \delta(n+1) < \delta(n) \ \text{ for } n \ge 8. \]

If we wish to prove only the existence of some threshold index $n_0$ having the
property that $n \ge n_0$ implies $\delta(n+1) < \delta(n)$, then we can use the following
shorter proof.

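Using (29) and (34) we get

\[ \kappa = C_L - \sqrt{\frac{\pi n}{2}} - \frac{2}{3} + \frac{n!}{n^n} = Q - \sqrt{\frac{\pi n}{2}} + \frac{1}{3}. \tag{35} \]

Substituting the power series

\[ Q = \sqrt{\frac{\pi n}{2}} - \frac{1}{3} + \frac{1}{12}\sqrt{\frac{\pi}{2n}} - \frac{4}{135 n} + \frac{1}{288}\sqrt{\frac{\pi}{2n^3}} + O(n^{-2}) \]

cited by D. E. Knuth [51, Equation (25) on page 120] into (35) and using

\[ \frac{1}{n^{k/2}} - \frac{1}{(n+1)^{k/2}} = \Theta\!\left(\frac{1}{n^{1+k/2}}\right) \]

for $k = 1$, 2, 3 and 4 we get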



K(n) - K(n + 1 ) = ( ^ - ) + 0{n-^) , 

implying 

K(n) - K(n + 1 ) = ^fl ^ , - -^ + 0(n-2) , 

guaranteeing the existence of the required $n_0$.

3.2 Running time of algorithm BACKWARD 

Backward compares the second ($s_2$), third ($s_3$), ..., last ($s_n$) element of the
realization with the previous elements until the first collision or until the last
pair of elements.

Taking into account the number of the necessary comparisons in line 4
of Backward, we get $C_{\text{best}}(n, \text{Backward}) = 1 = \Theta(1)$, and
$C_{\text{worst}}(n, \text{Backward}) = B(n, 2) = \Theta(n^2)$. The number of assignments is 1 in the best
case (in line 1) and is 2 in the worst case (in lines 1 and in line 5). The expected 
number of assignments is Aexp('n., Backward) = 1 + :;^) since only the good 
realizations require the second assignment. 

BACKWARD(n, s)

1 g ← True
2 for i ← 2 to n
3     for j ← i − 1 downto 1
4         if s_i = s_j
5             g ← False
6             return g
7 return g
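A direct Python transcription of Backward makes it possible to obtain the exact expected number of comparisons for small $n$ by enumerating all inputs. This is a sketch for illustration; the names `backward` and `expected_comparisons` and the returned counter are assumptions of this example.

```python
from fractions import Fraction
from itertools import product

def backward(n, s):
    """Return (g, comparisons); s is 0-indexed here, s[i] plays the role of s_{i+1}."""
    comparisons = 0
    for i in range(1, n):                 # line 2: second to last element
        for j in range(i - 1, -1, -1):    # line 3: previous elements, backwards
            comparisons += 1              # the test in line 4
            if s[i] == s[j]:
                return False, comparisons
    return True, comparisons

def expected_comparisons(n):
    """Average number of comparisons over all n^n equally likely inputs."""
    total = sum(backward(n, s)[1] for s in product(range(1, n + 1), repeat=n))
    return Fraction(total, n**n)
```

For $n = 3$ the enumeration gives $19/9 \approx 2.111111$, the value of $C_W$ listed for $n = 3$ in Table 2.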

The next assertion gives the expected running time. 

Theorem 12 The expected number of comparisons $C_{\text{exp}}(n, \text{Backward}) = C_W$
of the algorithm Backward is

\[ C_W = n - \sqrt{\frac{\pi n}{8}} + \frac{2}{3} - \frac{\kappa}{2} - \frac{n!}{n^n}\,\frac{n+1}{2} = n - \sqrt{\frac{\pi n}{8}} + \frac{2}{3} - \alpha, \]

where $\alpha(n) = \alpha = \frac{\kappa}{2} + \frac{n!}{n^n}\,\frac{n+1}{2}$ monotonically decreasing tends to zero when $n$
tends to $\infty$.

Proof. Let $y$ be as defined in (30), $p_k$ as defined in (11), and let

\[ z = \{q : 1 \le q \le k;\ s_1, s_2, \ldots, s_k \text{ are different};\ s_{k+1} = s_q \mid y = k\} \]

be a random variable characterising the index of the first repeated element of
$s$.

Let

\[ q_i(k, n) = q_i(k) = \Pr\{z = i \mid y = k\} \quad (k = 1, 2, \ldots, n;\ i = 1, 2, \ldots, k). \]

Backward executes $B(k, 2)$ comparisons among the elements $s_1, s_2, \ldots, s_k$,
and $s_{k+1}$ requires at least 1 and at most $k$ comparisons (with exception of case
$k = n$ when additional comparisons are not necessary). Therefore using the
law of total probability we have

\[ C_W = \sum_{k=1}^{n-1} p_k \left( B(k, 2) + \sum_{i=1}^{k} i\, q_i(k) \right) + p_n B(n, 2), \]

where

\[ q_i(k, n) = q_i(k) = \frac{1}{k} \quad (i = 1, 2, \ldots, k;\ k = 1, 2, \ldots, n). \tag{36} \]

Adding a new member to the first sum we get

\[ C_W = \sum_{k=1}^{n} p_k \left( B(k, 2) + \sum_{i=1}^{k} q_i(k)\, i \right) - p_n \sum_{i=1}^{n} q_i(n)\, i. \tag{37} \]

Using the uniform distribution (36) of $z$ we can determine its contribution to $C_W$:

\[ \sum_{i=1}^{k} q_i(k)\, i = \sum_{i=1}^{k} \frac{i}{k} = \frac{k+1}{2}. \tag{38} \]

Substituting the contribution in (38) into (37), and taking into account that
$B(k, 2) + (k+1)/2 = (k^2 + 1)/2$, we have

\[ C_W = \frac{1}{2} R_2 + \frac{1}{2} R_0 - \frac{n!}{n^n}\,\frac{n+1}{2}. \]

Now Lemma 6 and Lemma 7 result

\[ C_W = n - \sqrt{\frac{\pi n}{8}} + \frac{2}{3} - \frac{\kappa}{2} - \frac{n!}{n^n}\,\frac{n+1}{2}. \tag{39} \]

The known decreasing monotonicity of $\kappa$ and $\frac{n!}{n^n}\,\frac{n+1}{2}$ imply the decreasing
monotonicity of $\alpha$. □

  n   C_W        n − √(πn/8) + 2/3   (n!/n^n)((n+1)/2)   κ          α
  1   0.000000   1.040010            1.000000            0.080019   1.040010
  2   1.000000   1.780440            0.750000            0.060879   0.780440
  3   2.111111   2.581265            0.444444            0.051418   0.470154
  4   3.156250   3.413353            0.234375            0.045455   0.257103
  5   4.129600   4.265419            0.115200            0.041238   0.135819
  6   5.058642   5.131677            0.054012            0.038045   0.073035
  7   5.966451   6.008688            0.024480            0.035515   0.042237
  8   6.866676   6.894213            0.010815            0.033444   0.027536
  9   7.766159   7.786695            0.004683            0.031707   0.020537
 10   8.667896   8.685003            0.001996            0.030222   0.017107

Table 2: Values of $C_W$, $n - \sqrt{\pi n/8} + 2/3$, $(n!/n^n)((n+1)/2)$, $\kappa$, and
$\alpha = \kappa/2 + (n!/n^n)((n+1)/2)$ for $n = 1, 2, \ldots, 10$



Theorem 13 The expected running time $T_{\text{exp}}(n, \text{Backward}) = T_W$ of the
algorithm Backward is




where $\alpha = \kappa/2 + (n!/n^n)((n+1)/2)$ tends monotonically decreasing to zero
when $n$ tends to $\infty$.

Proof. Taking into account (39) and Aexp(rL, Backward) = 1 + — 

we get (40). □ 

Table 2 represents some concrete numerical results. It is worth to remark 
that = ) while k = @ > therefore k decreases much slower 

than the other expression. 

3.3 Running time of algorithm BUCKET 

Bucket divides the interval $[1, n]$ into $m = \sqrt{n}$ subintervals $I_1, I_2, \ldots, I_m$,
where $I_j = [(j-1)m + 1, jm]$ for $j = 1, 2, \ldots, m$, and sequentially puts the
elements of $s$ into the bucket $B_j$ (we use the word bucket due to some similarity
to bucket sort [19]): if $\lceil s_i/m \rceil = j$, then $s_i$ belongs to $B_j$. Bucket works until
the first repetition (stopping with $g$ = False), or up to the processing of the
last element $s_n$ (stopping with $g$ = True).

Bucket handles an array $Q[1:m, 1:m]$ (where $m = \lceil \sqrt{n} \rceil$) and puts the
element $s_i$ into the $r$th row of $Q$, where $r = \lceil s_i/m \rceil$, and it tests using linear
search whether $s_i$ appeared earlier in the corresponding bucket. The elements of the vector
$c = (c_1, c_2, \ldots, c_m)$ are counters, where $c_j$ ($1 \le j \le m$) shows the actual
number of elements in $B_j$.

BUCKET(n, s)

 1 g ← True
 2 m ← √n
 3 for j ← 1 to m
 4     c_j ← 1
 5 for i ← 1 to n
 6     r ← ⌈s_i/m⌉
 7     for j ← 1 to c_r − 1
 8         if s_i = Q_{r,j}
 9             g ← False
10             return g
11     Q_{r,c_r} ← s_i
12     c_r ← c_r + 1
13 return g
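The following Python sketch mirrors Bucket under two assumptions that are not part of the pseudocode: each bucket is stored as a growing list instead of a row of the array $Q$ with a counter, and $m = \lceil \sqrt{n} \rceil$ is used so that the code also works when $n$ is not a square. The function name `bucket` and the comparison counter are illustrative.

```python
from math import isqrt

def bucket(n, s):
    """Return (g, comparisons) for a sequence s with 1 <= s[i] <= n."""
    m = isqrt(n - 1) + 1                    # ceil(sqrt(n)) for n >= 1
    buckets = [[] for _ in range(m + 1)]    # buckets[1..m]; index 0 is unused
    comparisons = 0
    for x in s:
        r = (x + m - 1) // m                # ceil(x / m), the bucket index of line 6
        for y in buckets[r]:                # linear search, lines 7-8
            comparisons += 1
            if x == y:
                return False, comparisons
        buckets[r].append(x)                # lines 11-12
    return True, comparisons
```

Only elements falling into the same bucket are compared, which is why the expected number of comparisons stays of order $\sqrt{n}$.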

For simplicity let us suppose that $m$ is a positive integer and $n = m^2$.

In the best case $s_1 = s_2$. Then Bucket executes 1 comparison in line 8,
$m$ assignments in line 4, and 1 assignment in line 1, 1 in line 2, 2 in line 6,
and 1 in each of lines 9, 11 and 12, therefore $T_{\text{best}}(n, \text{Bucket}) = m + 7 = \Theta(\sqrt{n})$.
The worst case appears when the input is bad. Then each bucket requires
$1 + 2 + \cdots + (m - 1) = B(m, 2)$ comparisons in line 8, further $3m$ assignments
in lines 6, 11 and 12, totally $\frac{m^2(m-1)}{2} + 3m^2$ operations. Lines 1, 2, and 9 require
1 assignment per line, and the assignment in line 4 is repeated $m$ times. So
$T_{\text{worst}}(n, \text{Bucket}) = \frac{m^2(m-1)}{2} + 3m^2 + m + 3 = \Theta(n^{3/2})$.

In connection with the expected behaviour of Bucket at first we show that
the expected number of elements in a bucket has a constant bound which is
independent from $n$.

Lemma 14 Let $b_j(n) = b_j$ ($j = 1, 2, \ldots, m$) be a random variable characterising
the number of elements in the bucket $B_j$ at the moment of the first
repetition. Then

\[ E\{b_j\} = \sqrt{\frac{\pi}{2}} - \mu \quad (j = 1, 2, \ldots, m), \tag{41} \]

where

\[ \mu(n) = \mu = \frac{1}{3\sqrt{n}} - \frac{\kappa}{\sqrt{n}}, \tag{42} \]

and $\mu$ tends monotonically decreasing to zero when $n$ tends to infinity.

Proof. Due to the symmetry of the buckets it is sufficient to prove (41) and
(42) for $j = 1$.

Let $m$ be a positive integer and $n = m^2$. Let $y$ be the random variable
defined in (30) and $p_k$ be the probability defined in (11).

Let $A_i(n) = A_i$ ($i = 1, 2, \ldots, n$) be the event that the number $i$ appears
in $s$ before the first repetition and $Y_i(n) = Y_i$ be the indicator of $A_i$. Then
using the law of total probability we have

\[ E\{b_1\} = \sum_{i=1}^{m} E\{Y_i\} = \sum_{i=1}^{m} \Pr\{A_i\} = m \Pr\{A_1\} \]

and

\[ \Pr\{A_1\} = \sum_{k=1}^{n} \Pr\{1 \in \{s_1, s_2, \ldots, s_k\} \mid y = k\}\, p_k = \sum_{k=1}^{n} \frac{k}{n}\, p_k = \frac{1}{n} \sum_{k=1}^{n} p_k k = \frac{1}{n} R_1. \]

Using Lemma 7, we get

\[ E\{b_1\} = \frac{m}{n} R_1 = \frac{1}{\sqrt{n}} \left( \sqrt{\frac{\pi n}{2}} - \frac{1}{3} + \kappa \right), \]

resulting (41) and (42).

We omit the proof of the monotonicity of $\mu$, since it is similar to the corresponding
part in the proof of Lemma 7. □
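The relation $E\{b_1\} = R_1/\sqrt{n}$ used in the proof can be evaluated numerically. The sketch below computes $R_1$ exactly and only converts to floating point at the end; the function names `R1` and `expected_bucket_load` are assumptions of this example.

```python
from fractions import Fraction
from math import factorial, sqrt

def R1(n):
    """R_1(n) = sum_k p_k * k with p_k = n! k / ((n-k)! n^(k+1))."""
    return sum(Fraction(factorial(n) * k * k, factorial(n - k) * n**(k + 1))
               for k in range(1, n + 1))

def expected_bucket_load(n):
    """E{b_1} = (m / n) * R_1 with m = sqrt(n), as in the proof of Lemma 14."""
    return float(R1(n)) / sqrt(n)
```

For $n = 4$ the function returns $1.109375$ and for $n = 9$ approximately $1.152772$, matching the column $E\{b_1\}$ of Table 3; all values stay below $\sqrt{\pi/2} \approx 1.253314$.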

Table 3 shows some concrete values. 

Lemma 15 Let f(n) = i he a random variable characterising the number of 
comparisons executed in connection with the first repeated element. Then 

where 



n n ^ — n 



V6 + y7t78-K/2 
n(n) = r| = W—-, > 

and $\eta$ tends monotonically decreasing to zero when $n$ tends to infinity.

  n   E{b_1}     √(π/2)     1/(3√n)    κ/√n       μ
  1   1.000000   1.253314   0.333333   0.080019   0.253314
  2   1.060660   1.253314   0.235702   0.043048   0.192654
  3   1.090055   1.253314   0.192450   0.029686   0.162764
  4   1.109375   1.253314   0.166667   0.022727   0.143940
  5   1.122685   1.253314   0.149071   0.018442   0.130629
  6   1.132763   1.253314   0.136083   0.015532   0.120551
  7   1.140740   1.253314   0.125988   0.013423   0.112565
  8   1.147287   1.253314   0.117851   0.011824   0.106027
  9   1.152772   1.253314   0.111111   0.010569   0.100542
 10   1.157462   1.253314   0.105409   0.009557   0.095852

Table 3: Values of $E\{b_1\}$, $\sqrt{\pi/2}$, $1/(3\sqrt{n})$, $\kappa/\sqrt{n}$, and $\mu = 1/(3\sqrt{n}) - \kappa/\sqrt{n}$
for $n = 1, 2, \ldots, 10$



Proof. Let p(i, j,k, n) = p(l, k, u) be the probability of the event that there 
are k different elements before the first repetition, and the repeated element 
belongs to Bj, and Bj contains i elements in the moment of the first repetition. 
Due to the symmetry p(i, j,k, n) does not depend on j and 



since we investigate n^^^ sequences, and if there are k (1 < k < n) different 
elements before the repeated one, then we can choose i elements for the jth 
bucket in (^T^) (^-7^) manner, we can permute them in k! manner, and we can 
choose the repeated element in i manner. Then 

E{f } = ^ p (i, j , k) ^ - mpn (43) 



i,j,k,n 



m A k! / m\ /n — m\ ... n + 1 ,^ ^, 

= 2^L^L(0(k-iJ^^^+^)-^'^^ 

k=l i=l ^ ^ ^ ^ 

The last member of the formula takes into account that if k = n, then 
additional comparisons with the elements of the bucket corresponding to the 
repeated element are not necessary. 

Let 

E'{f} = E{f} + pn^. 
Then dividing the inner sum in (44) by $\binom{n}{k}$ we get the expected value of
the random variable $\xi(\xi + 1)$, where $\xi$ has hypergeometric distribution with
parameters $n$, $m$, and $k$. It is easy to compute that

EU(^ + 1 )} = E'{y (EU + 1 }] + Var{y = " 1 ) + (2n - 1 - m)] 



n(n — 1 ) 

therefore 

T?Hn m A k! /n\km[k(m-1) + (2n-1 -m)] 
EW=2^I.^y ^^(^) (45) 

1 

m-1 2n-l-m 2m + 1+Ri 

-Ri H — — — — = — — (46) 



2(n-l) ' 2(n-l) 2m + 2 



^ /^_ 1/6+v/^-K/2 

The convergence and monotonicity of $\eta$ is the consequence of the properties of
$\kappa$. Taking into account the small value of $p_n$ (see equation (11)) the difference
$E'\{f\} - E\{f\}$ has negligible influence on the limit of $E\{f\}$. □

Theorem 16 The expected number of comparisons Cexp (n-) Bucket) = Cb 
of Bucket is 

Cb = V^+^-^l+P, (48) 

where 



, , 5/6- J97i/8-3k/2 , , 

y n + I 

and p tends monotonically decreasing to zero when n tends to infinity. 

Proof. Let $s = (s_1, s_2, \ldots, s_n)$ be the input sequence of the algorithm
Bucket. Bucket processes the input sequence using $m = \sqrt{n}$ buckets $B_1, B_2,
\ldots, B_m$: it investigates the input elements sequentially and if the $i$-th input
element $s_i$ belongs to the interval $[(r-1)m + 1, (r-1)m + 2, \ldots, rm]$, then
it sequentially compares $s_i$ with the elements in the bucket $B_r$ and finishes if
it finds a collision, or puts $s_i$ into $B_r$ if $s_i$ differs from all elements in $B_r$.






Let $y$ be the random variable defined in (30), and $p_k$ the probability
defined in (11). Let $b_j$ be the random variable defined in Lemma 14, and
$c_j(n) = c_j$ ($j = 1, 2, \ldots, m$) be a random variable characterising the number
of comparisons executed in $B_j$ before the processing of the first repeated element,
and $c(n) = c$ a random variable characterising the number of necessary
comparisons executed totally by Bucket. Then due to the symmetry we have

\[ C_B = E\left\{ \sum_{j=1}^{m} c_j \right\} + E\{f\} = m E\{c_1\} + E\{f\}. \tag{50} \]

The probability of the event $A(i_1, i_2, k, n) = A(i_1, i_2, k)$ that the different elements
$i_1$ and $i_2$ ($1 \le i_1, i_2 \le m$) will be compared before the processing of the first
repeated element at the condition that $y = k$ and $2 \le k \le n$ equals to

\[ \Pr\{A(i_1, i_2, k) \mid y = k \text{ and } 2 \le k \le n\} = \frac{k(k-1)}{n(n-1)}, \]

since both elements have to be among the $k$ different elements preceding the first repetition.




Since there are (^) possible comparisons among the elements of the interval 
[1 , m] , we have 




from where using Lemma 7 and Lemma 8 we get 



E{ci} 



2n2 - 2n 



n — ^/rL 



(R2-R1) 



2n + 2Vn 



1 



2n-2 




(51) 



This equality implies 




(52) 



From (50), taking into account (52), (45), and (47) we get 



Cb = \/rL+ - 




17 + 



y9^ + 5/6-3K/2 



Denoting the last fraction by $\rho$ we get the required (48). The monotonicity of $\rho$
is the consequence of the monotonicity of $\kappa$. □

Theorem 17 The expected running time $T_{\text{exp}}(n, \text{Bucket}) = T_B$ of Bucket
is

18 = ^/^^(3 + 3^1) 4), (53) 

where 



, , . , , , n! 3y7^-1/3-3K/2 
4)(n) = ct) = 3K-p-3Ti ^ ^—j , 

and 4) ienrfs to zero when n tends to infinity. 

Proof. Bucket requires 2 assignments in lines 1 and 2, $\sqrt{n}$ assignments in
line 4, $R_1$ assignments in line 6, $C_B + E\{f\}$ assignments in line 8, $1 - p_n$ expected
assignment in line 9 and $2R_1$ assignments in lines 11 and 12 before the first
repeated element, and $2E\{f\} - 1$ assignments after the first repeated element.

Therefore the expected number $A_{\text{exp}}(n, \text{Bucket}) = A_B$ of assignments of
Bucket is

Ab = 2 + + 3Ri + Cb + 3E{f} - ^ . 

Substituting Ri, and Cb, and E{f} we get 

^ 13 ^ Inn ^ pn ^ pn ^ n\ , ^. 

Ab =2/fT+y +3J— + 3k- J- + P + 3J--3T1- — , (54) 



implying 



Ab = ( 2 + 3 + y + J| + 3 K + p - 3T1 - ^ 



Summing up the expected number of comparisons in (48) and of assignments 
in (54) we get the final formula (53). □ 



3.4 Test of random arrays 

Matrix is based on Bucket. 

For simplicity let us suppose that $n$ is a square.

Let $M = [m_{ij}]$ be an $n \times n$ sized matrix, where $m_{ij} \in \{1, 2, \ldots, n\}$. The $i$th row of
$M$ is denoted by $r_i$, and the $j$th column by $c_j$ for $1 \le i, j \le n$. The matrix $M$
is called good if all its lines (rows and columns) contain a permutation of the
elements $1, 2, \ldots, n$.

MATRIX(n, M)

 1 g ← True
 2 BUCKET(n, r_1)
 3 if g = False
 4     return g
 5 for i ← 2 to n
 6     BUCKET(n, r_i)
 7     if g = False
 8         return g
 9 for j ← 1 to n
10     BUCKET(n, c_j)
11     if g = False
12         return g
13 return g
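A compact Python sketch of Matrix follows. It assumes a bucket-style sequence test written for this illustration (the helper `bucket_test` uses lists and $m = \lceil \sqrt{n} \rceil$, which is not prescribed by the pseudocode), tests the rows first and then the columns, and stops at the first bad line.

```python
from math import isqrt

def bucket_test(n, s):
    """Return True if the sequence s (entries in 1..n) has no repeated element."""
    m = isqrt(n - 1) + 1                    # ceil(sqrt(n)) for n >= 1
    buckets = [[] for _ in range(m + 1)]    # buckets[1..m]; index 0 is unused
    for x in s:
        r = (x + m - 1) // m                # bucket index ceil(x / m)
        if x in buckets[r]:                 # linear search inside one bucket
            return False
        buckets[r].append(x)
    return True

def matrix_test(n, M):
    """Return True if every row and every column of the n x n matrix M is good."""
    for row in M:                           # lines 2-8 of MATRIX
        if not bucket_test(n, row):
            return False
    for col in zip(*M):                     # lines 9-12 of MATRIX
        if not bucket_test(n, col):
            return False
    return True
```

For a random matrix the very first row is bad with probability $1 - n!/n^n$, so the test usually stops after a single call of the sequence test, which is the intuition behind Theorem 18.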

Theorem 18 The expected running time $T_{\text{exp}}(n, \text{Matrix}) = T_M$ of Matrix
is

\[ T_M = T_B + o(1). \tag{55} \]

Proof. According to Theorem 17 we have



TB = Vi^(3 + 3y^j+^^ + o(l). 

Since the rows of $M$ are independent, the probability of the event
$G_k(n) = G_k$ ($k = 1, 2, \ldots, n$) that the first $k$ rows are good is

\[ \Pr\{G_k\} = \left( \frac{n!}{n^n} \right)^{k}, \]

so for the expected time $T_{\text{exp}}(n, \text{Matrix}) = T_r$ of the testing of the rows we
have

\[ T_r \le T_B + T_B \sum_{k=1}^{n-1} \left( \frac{n!}{n^n} \right)^{k} = T_B + o(1). \]

Since the columns are also independent, all the rows and the first $k$ columns
are good with the probability

\[ \left( \frac{n!}{n^n} \right)^{n+k}, \]

and so for the expected time of testing of the columns $T_{\text{exp}}(n, \text{Matrix}) = T_c$
holds

\[ T_c \le T_B \sum_{k=0}^{n-1} \left( \frac{n!}{n^n} \right)^{n+k} = o(1), \]

and so

\[ T_M = T_r + T_c \]

implies (55). □

Index and algorithm   C_best(n)   C_worst(n)   C_exp(n)
1. Linear             Θ(1)        Θ(n)         Θ(√n)
2. Backward           Θ(1)        Θ(n²)        Θ(n)
3. Bucket             Θ(1)        Θ(n√n)       Θ(√n)
4. Matrix             Θ(1)

Table 4: The number of comparisons of the investigated algorithms
in best, worst and expected cases

Index and algorithm   T_best(n)   T_worst(n)   T_exp(n)
1. Linear             Θ(n)        Θ(n)         Θ(n)
2. Backward           Θ(1)        Θ(n²)        Θ(n)
3. Bucket             Θ(√n)       Θ(n√n)       Θ(√n)
4. Matrix             Θ(√n)                    Θ(√n)

Table 5: The running times of the investigated algorithms in best, worst and
expected cases



4 Summary 

Table 4 summarises the basic properties of the number of necessary comparisons
of the investigated algorithms.

Table 5 summarises the basic properties of the running times of the investigated
algorithms.

We used in our calculations the RAM computation model [19]. If the investigated
algorithms run on real computers then we have to take into account
also the limited capacity of the memory locations and the increasing execution
time of the elementary arithmetical and logical operations.